Abstract: Interval analysis, when applied to the so-called problem of experimental
data fitting, appears to be still in its infancy. Sometimes, partly because of the
unrivaled reliability of interval methods, we do not obtain any results at all. Worse
yet, if this happens, then we are left in a state of complete ignorance concerning the
unknown parameters of interest. This is in sharp contrast with widespread statistical
methods of data analysis. In this paper I show the connections between those two
approaches: how to process experimental data rigorously, using interval methods, and
present the final results either as intervals (guaranteed, rigorous results) or in a
more familiar probabilistic form: as a mean value and its standard deviation.

This article is a companion paper to [1] and is meant to be its extension, but otherwise
it is self-contained. This is why we don't repeat everything here, except for the most
important thing: a correct way to bound the distances between uncertain experimental
values and the corresponding theoretical predictions.

1 The goals of experimental data processing 

The problem in front of us may be stated as follows. We have $N$ experimental data
points, labelled $\mathbf{m}_1, \ldots, \mathbf{m}_N$ (measurements), each one obtained
in different conditions $\mathbf{x}_j$, $j = 1, \ldots, N$ (called environments from now
on), so that each $\mathbf{m}_j = \mathbf{m}_j(\mathbf{x}_j)$. In addition, we have a
theory, $T$, predicting the behavior of the investigated phenomenon in various
environments. $T$ is characterized by $k$ ($k < N$) unknown parameters,
$\mathbf{p}_1, \ldots, \mathbf{p}_k$, so formally we can write:
$T(\mathbf{p}_1, \ldots, \mathbf{p}_k, \mathbf{x}_j) = \mathbf{t}_j$. In words: when the
(yet) unknown parameters have values $\mathbf{p}_1, \ldots, \mathbf{p}_k$ respectively,
and the environment state is $\mathbf{x}_j$, the theory $T$ predicts the observed outcome
as $\mathbf{t}_j$. All quantities typeset in boldface are interval objects, usually just
intervals, but they may be interval vectors as well. Contrary to the earlier theoretical
attempts (for the relevant references see the literature cited in [1]) we no longer
insist that the experimental intervals $\mathbf{m}_j$ are guaranteed, i.e. that they
contain the true values with probability exactly equal to 1; nevertheless, they may have
this property.

There are essentially two goals addressed by uncertain data processing: 

• to determine the values of interesting parameters, $\mathbf{p}_1, \ldots, \mathbf{p}_k$,
preferably together with their uncertainties, or

• to test whether a given model of the phenomenon under study (theory $T$) is adequate.



We will not go into hypothesis testing but instead will concentrate on finding unknown 
parameters given the uncertain experimental information. 



2 How do we find 'best fitted' parameters? 



In [1] we put forward the idea that the so-called 'best fits' should be based on the
distance between measured and theoretical values. In one dimension, when we compare a
single result of measurement with the predicted one, and at least one of those quantities
is an interval, the mathematically correct distance is the one valid in the interval
space $\mathbb{IR}$. Starting with the familiar Moore-Hausdorff distance [2], usually
written as

$$ d(\mathbf{a}, \mathbf{b}) = \max\left( |\underline{a} - \underline{b}|,\ |\overline{a} - \overline{b}| \right), \tag{1} $$

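we finally arrived at the tight interval estimate,
$\rho(\mathbf{t}, \mathbf{m}) = [\underline{\rho}, \overline{\rho}]$, of the distance
between the theoretical prediction $\mathbf{t}$ and the unknown true result of a
measurement, hidden somewhere within the interval $\mathbf{m}$:

when $c(\mathbf{t}) \in \mathbf{m}$:

lower bound: $\underline{\rho} = \tfrac{1}{2} w(\mathbf{t})$

upper bound: $\overline{\rho} = \max\left[ d(\mathbf{t}, \underline{m}),\ d(\mathbf{t}, \overline{m}) \right]$ (2)

when $c(\mathbf{t}) \notin \mathbf{m}$:

lower bound: $\underline{\rho} = \min\left[ d(\mathbf{t}, \underline{m}),\ d(\mathbf{t}, \overline{m}) \right]$

upper bound: $\overline{\rho} = \max\left[ d(\mathbf{t}, \underline{m}),\ d(\mathbf{t}, \overline{m}) \right]$, (3)

where $c(\cdot)$ stands for the center of its interval argument,
$c(\mathbf{t}) = \frac{1}{2}(\underline{t} + \overline{t})$, $w(\cdot)$ for its width,
and $d(\cdot, \cdot)$ is the Moore-Hausdorff distance between intervals.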
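
The bounds (2) and (3) can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: intervals are represented as plain
(lower, upper) tuples, the endpoints of m are treated as degenerate intervals when
passed to d, the function names `hausdorff` and `rho` are hypothetical, and no outward
rounding is performed, so the result is not rigorous in floating-point arithmetic.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (lower endpoint, upper endpoint)


def hausdorff(a: Interval, b: Interval) -> float:
    """Moore-Hausdorff distance between two intervals."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def rho(t: Interval, m: Interval) -> Interval:
    """Lower and upper bound on the distance between the prediction t
    and the unknown true value hidden somewhere inside m."""
    # Distances from t to each endpoint of m, taken as degenerate intervals
    # (an assumption of this sketch).
    d_left = hausdorff(t, (m[0], m[0]))
    d_right = hausdorff(t, (m[1], m[1]))
    center = 0.5 * (t[0] + t[1])
    if m[0] <= center <= m[1]:
        lower = 0.5 * (t[1] - t[0])      # half of the width of t
    else:
        lower = min(d_left, d_right)
    upper = max(d_left, d_right)
    return lower, upper


print(rho((1.0, 3.0), (0.0, 5.0)))  # center of t lies inside m
print(rho((1.0, 3.0), (4.0, 6.0)))  # center of t lies outside m
```

In the first call the lower bound is half the width of t; in the second it is the
distance to the nearer endpoint of m.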

Now, equipped with $\rho$, we can think about the distances in $N$-dimensional spaces.
They can be constructed, among others, as the counterparts of the so-called $L_p$ norms,
generally defined as

$$ L_p(x) = \left( \sum_{j=1}^{N} |x_j|^p \right)^{1/p}. \tag{4} $$

Here every individual $x_j$ is a distance measured along the $j$-th coordinate, and $p$
is a fixed, positive real number. The most important are the norms $L_1$, $L_2$, and
$L_\infty$. Specifically we have:



• $L_1$ distance, a.k.a. Manhattan metric or taxi driver metric:

$$ L_1(\mathbf{t}, \mathbf{m}) = \sum_{j=1}^{N} \rho(\mathbf{t}_j, \mathbf{m}_j) \tag{5} $$

This norm is used most often when we suspect the presence of outliers in the
experimental data set. The corresponding classical procedure bears the name LAD (Least
Average/Absolute Deviation) optimization.



• squared $L_2$ norm, i.e. squared Euclidean distance:

$$ L_2^2(\mathbf{t}, \mathbf{m}) = \sum_{j=1}^{N} \rho^2(\mathbf{t}_j, \mathbf{m}_j) \tag{6} $$

We have shown here $L_2^2$ rather than just $L_2$, in order to underline its close
relationship with the familiar $\chi^2$ functional. Its minimization, as is well known,
is the objective of the famous LSQ method. On the other hand, $L_2^2$ is a monotonically
increasing function of the non-negative $L_2$, so the minima of $L_2$ are located at the
same arguments as the minima of $L_2^2$.
• $L_\infty$ or maximum distance, which in interval analysis serves as a box's diameter:

$$ L_\infty(\mathbf{t}, \mathbf{m}) = \max_{j=1,\ldots,N} \frac{\rho(\mathbf{t}_j, \mathbf{m}_j)}{w(\mathbf{m}_j)} \tag{7} $$

This metric, in turn, is best suited for calibration purposes [3]. Here the goal is
to approximate uniformly the set of experimental points by any simple curve (or
surface), not necessarily physically meaningful, but easy to evaluate.
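
As a rough illustration of how the three functionals combine per-point distance bounds,
the sketch below takes a list of (lower, upper) pairs, one per data point, and returns
an interval enclosure of each functional. The function names are hypothetical, the
inputs are assumed to be non-negative and already scaled in whatever way the chosen
functional requires, and rounding errors are ignored.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound), both non-negative


def l1(dist: List[Interval]) -> Interval:
    """Sum of the per-point distance bounds."""
    return sum(lo for lo, _ in dist), sum(hi for _, hi in dist)


def l2_squared(dist: List[Interval]) -> Interval:
    """Sum of squares; squaring is monotone for non-negative bounds."""
    return sum(lo * lo for lo, _ in dist), sum(hi * hi for _, hi in dist)


def linf(dist: List[Interval]) -> Interval:
    """Largest per-point distance."""
    return max(lo for lo, _ in dist), max(hi for _, hi in dist)


bounds = [(1.0, 4.0), (3.0, 5.0), (0.5, 2.0)]
print(l1(bounds))          # (4.5, 11.0)
print(l2_squared(bounds))  # (10.25, 45.0)
print(linf(bounds))        # (3.0, 5.0)
```

All three functions share the same input, which is why a single general purpose
minimizer can serve any of them.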

In classical data analysis every single functional shown above is treated differently
from the remaining ones. In contrast, when using interval methods we need not follow
this path and develop procedures specific to each metric in turn. It is entirely possible
to use one and the same general purpose procedure to locate the global minimum of any of
these functionals. Such a procedure may be, for example, similar to that first described
35 years ago by Skelboe and known as the Moore-Skelboe algorithm.
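
To make the idea concrete, here is a minimal one-dimensional branch-and-bound sketch in
the spirit of the Moore-Skelboe algorithm. It is a simplified illustration, not
Skelboe's original formulation: boxes are kept in a priority queue ordered by the lower
bound of the objective, the most promising box is bisected, and the search stops when
that box is narrower than a tolerance. The names `minimize`, `square` and `F_example`
are hypothetical, the inclusion function is supplied by the user, and outward rounding
is omitted.

```python
import heapq
from typing import Callable, Tuple

Interval = Tuple[float, float]  # (lower endpoint, upper endpoint)


def minimize(F: Callable[[float, float], Interval], lo: float, hi: float,
             tol: float = 1e-6, max_iter: int = 200_000):
    """Return a lower bound on the global minimum of f over [lo, hi]
    and the narrow box on which that bound was attained.
    F(a, b) must enclose the range of f over [a, b]."""
    heap = [(F(lo, hi)[0], lo, hi)]
    for _ in range(max_iter):
        f_lo, a, b = heapq.heappop(heap)     # box with the smallest lower bound
        if b - a < tol:
            return f_lo, (a, b)
        mid = 0.5 * (a + b)
        for c, d in ((a, mid), (mid, b)):    # bisect and re-evaluate
            heapq.heappush(heap, (F(c, d)[0], c, d))
    raise RuntimeError("iteration limit reached")


def square(lo: float, hi: float) -> Interval:
    """Range of x**2 over [lo, hi]."""
    upper = max(lo * lo, hi * hi)
    lower = 0.0 if lo <= 0.0 <= hi else min(lo * lo, hi * hi)
    return lower, upper


def F_example(lo: float, hi: float) -> Interval:
    """Natural interval extension of f(x) = x**2 - 2*x (it overestimates the range)."""
    s_lo, s_hi = square(lo, hi)
    return s_lo - 2.0 * hi, s_hi - 2.0 * lo


bound, box = minimize(F_example, -3.0, 3.0)
print(bound, box)  # bound is close to -1, box lies close to x = 1
```

Only the inclusion function changes when a different functional is minimized; the
search loop stays the same.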



3 Troublesome interval output 

Regardless of the interval minimizer we use, the final outcome almost always appears
troublesome. When the result is a single interval box, the lengths of its edges are
usually much larger than the final uncertainties of the searched parameters as delivered
by other methods. This is because such a result, being an interval hull of what was
sought, also contains many 'bad' solutions. In fact, the true solutions occupy only a
small fraction of the volume returned by the algorithm. Whatever the reason, our very
reliable results simply look poor, and are by no means competitive.

At the other extreme, when our minimizing algorithm delivers many boxes (and by many we
mean not two, three or even a dozen boxes, but rather hundreds, or maybe even thousands
of them) we are in trouble again. There is no simple way to present such results to
other researchers in a compact form that is also acceptable to publishers. Of course, we
can quickly calculate the convex hull (or hulls, if the set of returned boxes is not
simply connected) of all boxes, but this takes us back to the previous situation.

Even if our results happen to be quite narrow, shall we call them 'guaranteed'?
Certainly not, whenever the input data have the form of a mean value and a standard
deviation, as is most often the case.

Hmmm. Let's think again. Suppose we have quite a number of boxes covering some domain in
parameter space, where the true solutions are located. Aren't those points the results
of what is called 'indirect measurement'? Of course they are! If so, then nothing can
prevent us from treating them as usual and calculating their mean values, dispersions,
etc.



4 Means, variances and correlations

Suppose the outcome returned by a minimizing routine is a cluster of $N_{\mathrm{box}}$
simply connected boxes $B_j$, $j = 1, \ldots, N_{\mathrm{box}}$, covering a single
solution. How do we calculate the 'ordinary' looking answers to our original problem?
The number of our indirect measurements is no longer finite, as is the case in direct
measurements. It is even uncountably infinite, but this fact alone is no real obstacle.
In place of various sums we simply have to calculate some definite integrals, that's
all. During calculations we have to assume that the probability density is uniform in
the interiors of all boxes. This position may seem strange at first sight (intervals can
never be treated in this spirit!) but is entirely correct. The final formulae, valid when
$N_{\mathrm{box}} > 1$, are as follows:



• mean values of unknown parameters, i.e. the center of gravity of a cluster

$$ p_0 = \frac{\sum_{j=1}^{N_{\mathrm{box}}} \mathrm{center}(B_j) \times \mathrm{Volume}(B_j)}{\sum_{j=1}^{N_{\mathrm{box}}} \mathrm{Volume}(B_j)}, \tag{8} $$

where now $p$ denotes a real-valued, $k$-dimensional vector of searched parameters,
not their ranges: $p = (p_1, \ldots, p_k)$, and the subscript '0' indicates their mean
(expected) values. Of course, 'center($B_j$)' is also a real-valued, $k$-dimensional
vector, pointing (you guessed it) to the center of box $B_j$. The meaning of the number
'Volume($B_j$)' is self-explanatory.

• dispersions (variances) of parameters

We use the textbook definitions for the covariance of two multidimensional random
variables $X$ and $Y$, when their expected values, $x_0$ and $y_0$, respectively, are
known:

$$ \mathrm{Cov}(X, Y) = \langle (X - x_0)(Y - y_0) \rangle, \tag{9} $$

where the angle brackets '$\langle \cdot \rangle$' mean the average (expected) value.
The variance of a random multidimensional variable can be computed in two equivalent
ways, either as

$$ \sigma^2(X) = \langle (X - x_0)^2 \rangle \quad \text{or as} \quad \sigma^2(X) = \mathrm{Cov}(X, X). \tag{10} $$

In our case $X = Y = (p_1, \ldots, p_k)$. If we denote the range (interval) of parameter
$p_m$ as $\mathbf{x}$, and the range of parameter $p_n$ as $\mathbf{y}$, both limited to
the current box under study, as indicated by the summation index $j$, then the
off-diagonal elements of the covariance matrix ($m \neq n$) are expressed as:



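$$ \mathrm{Cov}(p_m, p_n) = \frac{\sum_{j=1}^{N_{\mathrm{box}}} \left\{ \left[ (\overline{x} - 2x_0)\,\overline{x} - (\underline{x} - 2x_0)\,\underline{x} \right] \cdot \left[ (\overline{y} - 2y_0)\,\overline{y} - (\underline{y} - 2y_0)\,\underline{y} \right] \cdot RV_{xy} \right\}_j}{4 \sum_{j=1}^{N_{\mathrm{box}}} \mathrm{Volume}(B_j)}, \tag{11} $$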



where $x_0$ and $y_0$ are the mean values of $p_m$ and $p_n$, respectively, as computed
earlier from (8). The newly introduced symbol $RV_{xy}$ means 'reduced volume', that is
the volume of the $(k-2)$-dimensional box containing all parameters except $p_m$ and
$p_n$:



$$ RV_{xy} = \frac{\mathrm{Volume}(B_j)}{w(\mathbf{x})\, w(\mathbf{y})}. \tag{12} $$



For diagonal elements of the covariance matrix, when $m = n$, we have instead:



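$$ \sigma^2(p_m) = \frac{\sum_{j=1}^{N_{\mathrm{box}}} \left[ \overline{x}^2 + \underline{x}\,\overline{x} - 3x_0\overline{x} + \underline{x}^2 - 3x_0\underline{x} + 3x_0^2 \right]_j \times \mathrm{Volume}(B_j)}{3 \sum_{j=1}^{N_{\mathrm{box}}} \mathrm{Volume}(B_j)}. \tag{13} $$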



• correlations between parameters

According to any textbook on statistics, the coefficient of correlation between any two
multidimensional random variables is defined as:

$$ r(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma(X) \cdot \sigma(Y)}. \tag{14} $$

There should be no problem with calculating this quantity when we already have all the
necessary ingredients, obtained from the formulae above, in particular (11) and (13).
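
The formulae of this section can be sketched in plain Python as follows. This is an
illustration under assumptions, not a reference implementation: a box is a sequence of
(lower, upper) pairs, one per parameter; boxes are assumed not to overlap and to have
positive widths; the 'reduced volume' is computed as the box volume divided by the two
widths involved; and all function names are hypothetical.

```python
from math import prod, sqrt
from typing import List, Sequence, Tuple

Box = Sequence[Tuple[float, float]]  # one (lower, upper) pair per parameter


def volume(box: Box) -> float:
    return prod(hi - lo for lo, hi in box)


def cluster_mean(boxes: List[Box]) -> List[float]:
    """Volume-weighted center of gravity of the cluster, as in (8)."""
    total = sum(volume(b) for b in boxes)
    k = len(boxes[0])
    return [sum(0.5 * (b[i][0] + b[i][1]) * volume(b) for b in boxes) / total
            for i in range(k)]


def cluster_cov(boxes: List[Box], m: int, n: int) -> float:
    """Element (m, n) of the covariance matrix, as in (11) and (13)."""
    mean = cluster_mean(boxes)
    total = sum(volume(b) for b in boxes)
    x0, y0 = mean[m], mean[n]
    acc = 0.0
    for b in boxes:
        xl, xu = b[m]
        if m == n:   # diagonal element: variance
            acc += (xu * xu + xl * xu - 3 * x0 * xu
                    + xl * xl - 3 * x0 * xl + 3 * x0 * x0) * volume(b) / 3.0
        else:        # off-diagonal element: covariance
            yl, yu = b[n]
            reduced = volume(b) / ((xu - xl) * (yu - yl))
            acc += (((xu - 2 * x0) * xu - (xl - 2 * x0) * xl)
                    * ((yu - 2 * y0) * yu - (yl - 2 * y0) * yl) * reduced / 4.0)
    return acc / total


def cluster_corr(boxes: List[Box], m: int, n: int) -> float:
    return cluster_cov(boxes, m, n) / sqrt(
        cluster_cov(boxes, m, m) * cluster_cov(boxes, n, n))


# Two unit squares placed along the diagonal of the (p_1, p_2) plane.
cluster = [[(0.0, 1.0), (0.0, 1.0)], [(1.0, 2.0), (1.0, 2.0)]]
print(cluster_mean(cluster))        # [1.0, 1.0]
print(cluster_cov(cluster, 0, 0))   # close to 1/3
print(cluster_corr(cluster, 0, 1))  # close to 0.75
```

For this two-box cluster the positive correlation reflects the diagonal arrangement of
the boxes, something that cannot be seen from a single enclosing box.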



5 Discussion 

The omission of the case $N_{\mathrm{box}} = 1$ was deliberate. It is both an easy and a
hopeless case, and here is why. The easy part consists in calculating the mean values of
the unknown parameters. They are all simply equal to the centers of the corresponding
ranges. It is also easy to show that their dispersions have to be equal to halves of the
widths of their ranges. One will nevertheless be strongly disappointed with the
correlations between parameters: there are none, all are exactly equal to zero. But
could all this be true? Certainly not.

The natural question is how accurate the results suggested here are. The boxes
comprising the simply connected set covering the domain of possible solutions are not
all created equal. Some of them are completely filled with possible solutions, while
others, those located at the boundaries, are filled with solutions only in part. This
must necessarily affect our results, since those were derived with only the first kind
of boxes in mind. It is intuitively clear that the more boxes we have, and the smaller
they are, the closer the 'filling factor' will be to 100%. Consequently, our results
will be closer to reality. All we can say is that the dispersions should always come out
overestimated, at least in the cases where both the input data and the theory are
correct. In statistical language we may say that our estimate of dispersions (or
variances, if you prefer) is consistent but positively biased. Fortunately, this does no
harm.

Covariances and correlations are quite a different story. As we could see, our ignorance
in that matter remains completely intact when we have only a single box at our disposal.
Of course, increasing the number of boxes will take us closer to the true values. In the
case of off-diagonal elements of the covariance/correlation matrix we have no guarantee
that the convergence will be one-sided. This brings us to the question: how many boxes
do we really need? The exhaustive answer to this problem is beyond the current author's
capabilities. One may hope, by analogy with other statistical problems, that sensible
results should start to appear when $N_{\mathrm{box}}$ exceeds, say, 20. Fortunately,
the optimizing routine usually delivers many more boxes, counted in hundreds.

We haven't discussed the question of complexity in this paper. From what was said, it
is clear that better, more accurate results are also more costly than just rough
estimates, depending on whether we are working with a single box or with many boxes.

6 Conclusions 

Interval-oriented routines do more than generate reliable estimates of unknown
parameters as a result of uncertain data processing. The results so obtained can also
be safely and reliably 'translated' into the more widespread statistical form of
presentation.

The interval perspective sheds completely new light on experimental data processing.
Here we see in detail what is really going on. Moreover, in many cases interval methods
allow for objective estimates of accuracies, with no need for human experts (who
sometimes err very much in their estimates).